xAI launches Grok Imagine, an AI tool generating images and short videos

The launch of Grok Imagine marks a staged release of a new creative generator inside X. The feature turns text prompts into images within seconds and then animates them into 15‑second videos with native audio.

The release reached SuperGrok and Premium+ users on iOS first. The interface is built for speed and ease, with results appearing in seconds and an auto‑scroll feed that keeps producing output.

Grok Imagine adds a notable option: a “Spicy” mode that can create NSFW content while still moderating and blurring restricted requests. Early tests show firmer guardrails for celebrity faces.

Elon Musk called the rollout a “0.1 beta” and invited users to help refine the model. The tool aims to improve daily in image quality, speed, and features as more user feedback pours in.

xAI launches Grok Imagine: availability, access, and what’s live now

At launch, the creative generator is available to a small group of paying subscribers on X. SuperGrok and Premium+ subscribers get first access inside the iOS app, making the feature an exclusive perk for premium accounts.

Who can use it

Access is limited to eligible subscribers. These users can create images from text prompts and then work within the app to turn a chosen image into a short clip.

Platform rollout

The iOS app is the lead client, with Imagine live for eligible subscribers. Early access on Android has begun; reports say image generation works there but full video generation is not yet available. Web availability has not been announced.

What it generates today

The tool supports text-to-image and an image-to-video flow that produces 15‑second videos with native audio. The experience emphasizes speed and a simple generate→select→animate process rather than direct text-to-video generation.

“0.1 beta” — Elon Musk, describing the model and encouraging feedback to refine daily.

Inside Grok Imagine: features, modes, and the “Spicy” NSFW option

Users interact with a two-step flow: generate an image, then turn that image into a short clip with a few taps. The core feature links fast text prompts to an image-first animation pipeline rather than direct text-to-video creation.

Text-to-image, styles, and voice prompts

The generator offers style controls for photorealism, anime, and illustration looks. You can type prompts or use voice input to steer results. This lowers friction and makes image generation more accessible to different users.

Image-to-video modes

Video creation needs a reference image, either uploaded or produced in the app. Four animation modes (Custom, Normal, Fun, and Spicy) change how motion and sound are applied during video generation.

NSFW, moderation, and celebrity limits

Spicy mode permits adult content, including limited partial female nudity, while moderation blurs or blocks more explicit requests. The model also applies extra restrictions for public figures. Tests attempting extreme depictions of certain celebrities instead returned safer alternatives, showing how content filters and restrictions shape outcomes.

How Grok Imagine compares and early reactions from users

Observers noted that this tool favors an image-first workflow rather than full text-to-video generation. In contrast to rival tools like Google Veo and OpenAI Sora, it animates a chosen picture into a brief clip instead of producing direct text-driven video. That difference changes creative workflows and moderation needs.

Contrast with Google Veo and OpenAI Sora

Veo and Sora aim for direct text-to-video and tighter safety filters. Grok Imagine opts for an image-to-video pipeline and a permissive NSFW mode, which has led to wide experimentation and mixed reactions from users.

Quality today: uncanny valley and areas for improvement

Early posts show striking outputs — photorealistic adults and anime scenes — but critics flag realism issues. Faces can look waxy and motion sometimes feels stiff, creating an uncanny valley that the model needs to fix.

Conclusion

Grok Imagine ships as a bold, early feature that fast-tracks image generation and 15‑second videos inside the app. The model uses an image-first flow, supports voice and style prompts, and adds a permissive Spicy mode with moderation and celebrity limits.

Early outputs show creative potential alongside realism issues like waxy skin and stiff motion. Elon Musk framed the release as a 0.1 beta and asked for user feedback to refine the tool.

For creators, the feature makes experimenting with short video simple. For the platform, it strengthens the subscription value proposition while raising questions that experts will track as the model, guardrails, and access evolve.